Sentence | Adverb time reference | Verb tense | Grammaticality (gramm) |
---|---|---|---|
A la salida del trabajo, **ayer** las chicas **compraron** pan en la tienda.<br> *After leaving work* **yesterday** *the girls* **bought** *bread at the shop* | past | past | gramm |
A la salida del trabajo, **ayer** las chicas **\*comprarán** pan en la tienda.<br> *After leaving work* **yesterday** *the girls* **\*will buy** *bread at the shop* | past | future | ungramm |
A la salida del trabajo, **mañana** las chicas **comprarán** pan en la tienda.<br> *After leaving work* **tomorrow** *the girls* **will buy** *bread at the shop* | future | future | gramm |
A la salida del trabajo, **mañana** las chicas **\*compraron** pan en la tienda.<br> *After leaving work* **tomorrow** *the girls* **\*bought** *bread at the shop* | future | past | ungramm |
14 Report 2
(Generalised) linear mixed models
The goal of this report is to review and consolidate what we learned together in the second block of the course. You are not required to do anything that we have not already seen.
For students enrolled in this course in the Winter Semester 2023/24: The report is due March 29, 2024 at 11:59pm. Please submit your Quarto script, as well as a rendered copy in HTML and PDF to Moodle (under ‘Reports’).
14.1 Dataset
For this report you will continue using the data from Biondo et al. (2022), an eye-tracking reading study on adverb-tense congruence effects on reading time measures. Participants’ eye movements were recorded as they read Spanish sentences where temporal adverbs and verb tense were either congruent or incongruent. For both sentence regions, the time reference was either past (e.g., yesterday, bought) or future (e.g., tomorrow, will buy). Example stimuli from this experiment are given in Table 14.1.
You will be fitting models to different eye-tracking reading measures from this experiment, with the predictors adverb time and grammaticality.
14.2 Set-up
Make sure you begin with a clear working environment. To achieve this, you can go to Session > Restart R. Your Environment should have no objects in it, and you should not have any packages loaded.
14.2.1 Quarto YAML
Make sure your YAML looks something like this:
---
title: "Report 2"
author: "My Name"
format:
  html: default
  pdf: default
toc: true
number-sections: true
---
I suggest you render your document frequently, e.g., after every substantial code chunk/task achieved. This will ensure earlier detection of broken code and makes it easier to fix problems. Do this for both HTML and PDF.
14.2.2 Packages
Load the following packages, however you prefer (i.e., you don’t have to use pacman::p_load()):
- tidyverse
- janitor
- here
- broom.mixed
- lattice
- lme4
- lmerTest
Describe what each of the following packages is used for, as far as we have used them (they have many more useful functions than we’ve tried).
- broom.mixed:
- lattice:
- lme4:
- lmerTest:
14.2.3 Data
Load in the Biondo et al. (2022) data by running the following code chunk.
df_biondo <-
  read_csv(here("data", "Biondo.Soilemezidi.Mancini_dataset_ET.csv"),
           locale = locale(encoding = "Latin1") ## for special characters in Spanish
  ) |>
  clean_names() |>
  mutate(gramm = ifelse(gramm == "0", "ungramm", "gramm")) |>
  mutate_if(is.character, as_factor) |> # all character variables as factors
  filter(adv_type == "Deic") |>
  droplevels() |>
  mutate(
    roi_length = str_length(label)
  ) |>
  relocate(roi_length, .after = label)
The last few lines add a new variable (roi_length) that contains region length (in letters). We will use this as a covariate in one of our models.
14.3 Model set up
You will be asked to run two models: one linear mixed model (lmer() from the lme4 or lmerTest package) and one generalised (logistic) linear mixed model (glmer(family = "binomial") from the lme4 package).
14.3.1 Variable transformations
For each model, consider whether you need to implement the following steps:
- centre (sum contrast code) categorical predictors
- standardize continuous predictors (e.g., using the scale() function)
- log-transform continuous dependent variables if skewed
- model selection: begin with a maximal model
- simplify in case of nonconvergence or singular fit
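The transformation steps in the list above can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `toy` and its columns (`cond`, `len`, `rt_ms`) are made up for the example and are not variables from the Biondo et al. (2022) data.

```r
# Toy data standing in for the real variables (all names are hypothetical)
toy <- data.frame(
  cond  = factor(rep(c("past", "future"), each = 4), levels = c("past", "future")),
  len   = c(4, 6, 5, 7, 6, 8, 5, 7),
  rt_ms = c(210, 340, 295, 810, 260, 455, 300, 1250)
)

# Sum contrast coding: with two levels, +0.5 / -0.5 centres the predictor.
# In a balanced design the intercept is then the grand mean, and the slope
# is the difference between the two levels (here: past minus future).
contrasts(toy$cond) <- c(0.5, -0.5)

# Standardise a continuous predictor. scale() returns a one-column matrix,
# so as.numeric() turns it back into a plain vector.
toy$len_z <- as.numeric(scale(toy$len))

# Log-transform a positively skewed dependent variable
toy$log_rt <- log(toy$rt_ms)

contrasts(toy$cond)  # inspect the contrast matrix
mean(toy$len_z)      # approximately 0 after standardising
sd(toy$len_z)        # 1 after standardising
```

Setting the contrasts on the factor itself, before fitting, means every model fitted to that data frame uses the same coding.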
14.3.2 Model selection
For each model, start with a “maximal” model justified by the design. If you encounter convergence issues, begin by implementing “unintrusive” remedies. If you still have convergence issues (as indicated by warning messages and/or, e.g., by inspecting the variance-covariance matrix), reduce the random effects structure as you see fit. Be sure to document and justify your decisions step-by-step. N.B., the equivalent of lmerControl() (passed to the control argument of lmer() models) is glmerControl() for glmer() models.
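As a minimal sketch of one such remedy, the chunk below switches the optimiser and raises its iteration limit through the control argument. It uses the `sleepstudy` data that ships with lme4 as a stand-in (an assumption made for illustration; it is not the report data), and the particular optimiser settings are just one common choice, not a required one.

```r
library(lme4)

# sleepstudy ships with lme4 and stands in for the report data here
fit_default <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)

# Same model, but with a different optimiser and a higher iteration cap
fit_bobyqa <- lmer(
  Reaction ~ Days + (1 + Days | Subject),
  data = sleepstudy,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e5))
)

# Two quick checks after fitting
isSingular(fit_bobyqa)  # TRUE would indicate a singular fit
VarCorr(fit_bobyqa)     # variances near 0 or correlations near +/-1 are warning signs
```

For a glmer() model, the same arguments would go into glmerControl() instead.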
If you choose to use the lme4::allFit() function, beware that it can take a long time to run, especially on ‘maximal’ models. I suggest you (i) save the output as an object (e.g., allFit_model1 <- allFit(model1)) and (ii) plan another task that doesn’t involve running code when you run this function.
I am not expecting any particular model/random effects structure that is correct, but am looking for explanations on how you made decisions regarding what to remove or keep in your model.
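One possible reduction sequence is sketched below, again on lme4's built-in `sleepstudy` data rather than the report data. The order of steps (drop the correlation first, then the slope) is an assumption for the sake of the example, not a prescribed procedure. Note that the `||` syntax only removes correlations cleanly for numeric predictors; with factors it does not behave as one might expect.

```r
library(lme4)

# Step 0: maximal random effects structure for this toy design
fit_max <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)

# Step 1: remove the intercept-slope correlation (zero-correlation model)
fit_zcp <- lmer(Reaction ~ Days + (1 + Days || Subject), data = sleepstudy)

# Step 2: random intercepts only
fit_int <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Compare nested random effects structures without refitting with ML
cmp <- anova(fit_zcp, fit_max, refit = FALSE)
cmp
```

Whichever steps you take, each one should be accompanied by a short written justification.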
14.4 Linear mixed model
Fit a linear mixed model to total reading times (tt) at the adverb region (roi == 2). Your fixed effects are adverb time reference (adv_t), grammaticality (gramm), their interaction, and (standardized) region length in characters as a covariate without any interaction. Include by-participant and by-item random effects.
14.4.1 Fit a model
Start by defining the most maximal model justified by your design, and simplify accordingly. Remember not to delete the code for nonconverging models; instead, set the code chunk to not run when you render your document, as in the code chunk below (#| eval: false).
```{r}
#| eval: false
fit_some_maximal_model <-
  lmer(dependent_variable ~ predictor1*predictor2 + covariate +
         (1 + predictor1*predictor2|participant),
       data = my_data,
       subset = some_factor == "some_level")
# informative comment, e.g., "didn't converge"
```
14.4.2 Report results
Once you’ve landed on a final model that converges, inspect the fixed and random effects (some useful functions we’ve already seen: summary(), broom.mixed::tidy(), fixef(), ranef(), coef(), lattice::dotplot()).
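As a rough illustration of how three of these functions relate to each other, the sketch below fits a model to lme4's built-in `sleepstudy` data (a stand-in, not the report data) and shows that coef() is the sum of the fixed effect and the group-level deviation.

```r
library(lme4)

fit <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)

fixef(fit)                # population-level estimates
head(ranef(fit)$Subject)  # by-subject deviations from the fixed effects
head(coef(fit)$Subject)   # by-subject estimates (fixed effect + deviation)

# Reconstruct the by-subject slopes by hand and compare with coef()
slopes_by_hand <- fixef(fit)[["Days"]] + ranef(fit)$Subject$Days
all.equal(slopes_by_hand, coef(fit)$Subject$Days)
```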
14.5 Generalised linear mixed model
We didn’t cover how to implement logistic mixed regression; however, the relationship between lm() and glm() carries over to mixed models (lmer() and glmer()).
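To make the parallel concrete, here is a minimal sketch of a glmer() call, using the `cbpp` data that ships with lme4 (disease incidence per herd, so not reading data, and chosen only because it is built in). The syntax matches lmer() with the addition of the family argument. In `cbpp` the outcome is given as successes and failures per row; with a binary 0/1 outcome, the left-hand side of the formula would simply be that column.

```r
library(lme4)

# Logistic mixed model: same formula syntax as lmer(), plus family
fit_glmm <- glmer(
  cbind(incidence, size - incidence) ~ period + (1 | herd),
  data = cbpp,
  family = binomial
)

fixef(fit_glmm)  # estimates are on the log-odds scale
```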
14.5.1 Fit a model
Fit a generalised linear mixed model (glmer() from the lme4 package; lmerTest does not have this function) to regressions in (ri) to the adverb region (roi == 2). Your fixed effects are adverb time reference (adv_t), grammaticality (gramm), and their interaction. Remember to use eval: false in your code chunk options to stop Quarto from running all your non-final models when rendering.
14.5.2 Report results
Once you’ve landed on a final model that converges, inspect the fixed and random effects (some useful functions we’ve already seen: summary(), broom.mixed::tidy(), fixef(), ranef(), coef(), lattice::dotplot()).
Recall that our coefficient estimates are in log odds. The interpretation of your coefficient estimates (fixed effects) is identical to that in generalised linear models (i.e., without random effects).
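As a reminder of how log odds can be converted for interpretation, here is a small base R sketch. The two estimates are hypothetical numbers chosen for illustration and are not results from the data.

```r
# Hypothetical log-odds estimates, for illustration only
intercept_logodds <- -1.2
effect_logodds <- 0.4

# Exponentiating a log-odds coefficient gives an odds ratio
exp(effect_logodds)

# plogis() is the inverse logit: it maps log odds to probabilities
plogis(intercept_logodds)                   # probability at the intercept
plogis(intercept_logodds + effect_logodds)  # probability after adding the effect
```

Note that a constant change in log odds does not correspond to a constant change in probability, which is why probabilities are computed from the summed log odds rather than by converting each coefficient separately.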
14.6 Interpretation
Write a short report of the findings from the two models. Produce a table and plot like in the example above to supplement your report.
14.7 Render
Render your finished Quarto script. Upload the .qmd, .pdf, and .html files to Moodle. N.B., you need to have tinytex installed to be able to render PDFs.